15 research outputs found

    Solving Nonlinear Optimal Control Problems using a Hybrid IPSO-SQP Algorithm

    A hybrid algorithm that integrates an improved particle swarm optimization (IPSO) with successive quadratic programming (SQP), called IPSO-SQP, is proposed for solving nonlinear optimal control problems. Particle swarm optimization (PSO) is shown to converge rapidly to a near-optimal solution, but its search becomes very slow near the global optimum. SQP, by contrast, is weak at escaping local optima but converges faster near the global optimum and can reach higher accuracy. Hence, the proposed method begins the search with a PSO algorithm to find a near-optimal solution; an improved PSO (IPSO) is used at this stage to enhance the global search ability and convergence speed. When the change in fitness value falls below a predefined threshold, the search switches to SQP to accelerate convergence and find an accurate solution. In this way, the hybrid algorithm may find an optimum solution more accurately. To validate its performance, the proposed IPSO-SQP approach is evaluated on two optimal control problems. Results show that the performance of the proposed algorithm is satisfactory.
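    The two-stage idea in this abstract (global swarm search, then a switch to a local solver once the fitness stops improving) can be sketched roughly as follows. This is not the authors' IPSO-SQP: the swarm is a plain PSO, the local stage is a finite-difference gradient descent standing in for SQP (a real SQP routine such as SciPy's SLSQP method could take its place), and the test function `cost`, the `patience` counter, and every coefficient are assumptions made for illustration.

```python
import random


def cost(x):
    # Hypothetical convex test function with its minimum at (1, -2).
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2


def pso_stage(f, dim, bounds, rng, n_particles=20, max_iter=200,
              switch_tol=1e-4, patience=5):
    """Plain PSO that stops once the best fitness stalls (the switch point)."""
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    stalled = 0
    for _ in range(max_iter):
        prev = gbest_val
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (pbest[i][d] - pos[i][d])
                             + 1.5 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        # Count consecutive iterations whose fitness change is below the threshold.
        stalled = stalled + 1 if prev - gbest_val < switch_tol else 0
        if stalled >= patience:
            break
    return gbest, gbest_val


def local_stage(f, x0, max_iter=500, h=1e-6, grad_tol=1e-8):
    """Stand-in for SQP: gradient descent with a backtracking line search."""
    x, fx = x0[:], f(x0)
    for _ in range(max_iter):
        grad = []
        for d in range(len(x)):
            xp, xm = x[:], x[:]
            xp[d] += h
            xm[d] -= h
            grad.append((f(xp) - f(xm)) / (2.0 * h))  # central difference
        g2 = sum(gd * gd for gd in grad)
        if g2 < grad_tol ** 2:
            break
        step = 1.0
        while step > 1e-12:
            trial = [xi - step * gd for xi, gd in zip(x, grad)]
            ft = f(trial)
            if ft <= fx - 0.3 * step * g2:  # sufficient-decrease test
                x, fx = trial, ft
                break
            step *= 0.5
        else:
            break  # no acceptable step found
    return x, fx


def hybrid_minimize(f, dim, bounds, seed=0):
    rng = random.Random(seed)
    x_pso, f_pso = pso_stage(f, dim, bounds, rng)
    x_loc, f_loc = local_stage(f, x_pso)
    return x_pso, f_pso, x_loc, f_loc
```

    Calling `hybrid_minimize(cost, 2, (-5.0, 5.0))` returns the swarm's stopping point together with the refined point, so the contribution of each stage can be compared directly.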

    Improving of Multivariable PI Controller with a High Gain Structure for an Irregular System by Genetic Algorithm

    This paper describes an optimal design of a multivariable PI controller with a high-gain structure for an irregular system, using a genetic algorithm. PI controllers with a high-gain structure lead to an asymptotic decomposition of the closed-loop system into fast and slow modes with distinct characteristics. The slow modes are asymptotically uncontrollable and unobservable; therefore, they play no role in the input-output behavior. The closed-loop response is governed only by the fast poles, so the system responds quickly. An essential requirement of this design is that the first Markov parameter of the multivariable system (the matrix product CB) must have full rank. If CB does not have full rank, a measurement matrix (M) is used with internal feedback. In this structure, the measurement matrix is chosen by a genetic algorithm so as to obtain a stable closed-loop system and minimize the interaction between outputs. The design is implemented on two different kinds of systems. The results show that the PI controller with a high-gain structure tuned by the genetic algorithm achieves a good response time in comparison with other methods.
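    The full-rank requirement on the first Markov parameter CB can be checked numerically before any controller is designed. The sketch below is only an illustration of that test: the matrices `C`, `B_regular`, and `B_irregular` are made-up examples, and the helper names are hypothetical. It does not attempt the measurement-matrix design itself, which the abstract leaves to the genetic algorithm.

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def matrix_rank(m, tol=1e-9):
    """Rank by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    rows, cols = len(a), len(a[0])
    r = 0
    for c in range(cols):
        if r == rows:
            break
        pivot = max(range(r, rows), key=lambda i: abs(a[i][c]))
        if abs(a[pivot][c]) < tol:
            continue  # no usable pivot in this column
        a[r], a[pivot] = a[pivot], a[r]
        for i in range(r + 1, rows):
            factor = a[i][c] / a[r][c]
            for j in range(c, cols):
                a[i][j] -= factor * a[r][j]
        r += 1
    return r


def has_full_rank_cb(c_mat, b_mat):
    """True when the first Markov parameter CB has full rank."""
    cb = mat_mul(c_mat, b_mat)
    return matrix_rank(cb) == min(len(cb), len(cb[0]))


# Made-up 3-state, 2-input, 2-output examples.
C = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
B_regular = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]    # CB is the identity
B_irregular = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]  # second input misses the outputs
```

    With these matrices, `has_full_rank_cb(C, B_irregular)` is false, which is the situation where the abstract brings in the measurement matrix M.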

    Parameter Estimation of Bilinear Systems Based on an Adaptive Particle Swarm Optimization

    Bilinear models can approximate a large class of nonlinear systems adequately, usually with considerable parsimony in the number of coefficients required. This paper applies the Particle Swarm Optimization (PSO) algorithm to both the offline and the online parameter estimation problem for bilinear systems. First, an Adaptive Particle Swarm Optimization (APSO) is proposed to increase the convergence speed and accuracy of basic PSO and thereby save computation time. An illustrative example of bilinear system modeling confirms its validity in comparison with the Genetic Algorithm (GA), Linearly Decreasing Inertia Weight PSO (LDW-PSO), Nonlinear Inertia Weight PSO (NDW-PSO), and Dynamic Inertia Weight PSO (DIW-PSO) in terms of parameter accuracy and convergence speed. Second, APSO is extended to detect and determine time-varying parameters: a sentry particle is introduced to detect any change in the system parameters. Simulation results confirm that the proposed algorithm is a promising PSO variant for online parameter estimation.
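    To make the estimation problem concrete, the sketch below writes out the kind of objective a swarm could minimize: the squared error between measured outputs and the outputs of a candidate bilinear model. The first-order model form, the parameter names `a`, `b`, `c`, and the sinusoidal input are assumptions chosen for illustration; the abstract does not specify the model used in the paper.

```python
import math


def simulate_bilinear(params, inputs, y0=0.0):
    """Simulate y[k+1] = a*y[k] + b*u[k] + c*u[k]*y[k] (assumed model form)."""
    a, b, c = params
    y = [y0]
    for u in inputs[:-1]:
        y.append(a * y[-1] + b * u + c * u * y[-1])
    return y


def estimation_cost(params, inputs, measured):
    """Sum of squared output errors for a candidate parameter vector."""
    predicted = simulate_bilinear(params, inputs)
    return sum((m - p) ** 2 for m, p in zip(measured, predicted))


# Hypothetical experiment: a known input and noise-free measurements.
inputs = [math.sin(0.3 * k) for k in range(50)]
true_params = (0.5, 1.0, 0.2)
measured = simulate_bilinear(true_params, inputs)
```

    Each particle position would be one `(a, b, c)` triple scored by `estimation_cost`; the cost is zero at the true parameters and positive for any candidate whose simulated output differs.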

    A Policy Iteration Approach to Online Optimal Control of Continuous-Time Constrained-Input Systems

    This paper is an effort toward developing an online learning algorithm that finds the optimal control solution for continuous-time (CT) systems subject to input constraints. The proposed method is based on the policy iteration (PI) technique, which has recently evolved into a major technique for solving optimal control problems. Although a number of online PI algorithms have been developed for CT systems, none of them takes into account the input constraints caused by actuator saturation. In practice, however, ignoring these constraints leads to performance degradation or even system instability. In this paper, a suitable nonquadratic functional is employed to encode the input constraints into the optimization formulation. The proposed PI algorithm is then implemented on an actor-critic structure to solve the Hamilton-Jacobi-Bellman (HJB) equation associated with this nonquadratic cost functional in an online fashion. That is, two coupled neural network (NN) approximators, namely an actor and a critic, are tuned online and simultaneously to approximate the associated HJB solution and compute the optimal control policy. The critic evaluates the cost associated with the current policy, while the actor finds an improved policy based on the information provided by the critic. Convergence to a close approximation of the HJB solution and stability of the proposed feedback control law are shown. Simulation results on a nonlinear CT system illustrate the effectiveness of the proposed approach.
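    The abstract does not state which nonquadratic functional is used. One choice that is common in the constrained-input literature, assumed here purely for illustration, is the scalar penalty U(u) = 2 r lam * integral from 0 to u of atanh(v / lam) dv with saturation bound lam. Setting the derivative of the Hamiltonian with respect to u to zero then gives a control of the form u = -lam * tanh(s / (2 lam r)), which can never exceed the bound. In the sketch, `costate_term` is a hypothetical name for the scalar g(x)^T dV/dx.

```python
import math


def bounded_control(costate_term, lam, r):
    """Stationary control for the assumed penalty; |u| <= lam by construction."""
    return -lam * math.tanh(costate_term / (2.0 * lam * r))


def input_penalty(u, lam, r):
    """Closed form of 2*r*lam * integral_0^u atanh(v/lam) dv, valid for |u| < lam."""
    s = u / lam
    return (2.0 * lam * r * u * math.atanh(s)
            + lam * lam * r * math.log(1.0 - s * s))
```

    The penalty is zero at u = 0 and grows without a quadratic shape as u approaches the bound, which is how the constraint ends up encoded in the cost rather than imposed afterwards by clipping.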

    Adaptive Optimal Control of Unknown Constrained-Input Systems using Policy Iteration and Neural Networks

    This paper presents an online policy iteration (PI) algorithm to learn the continuous-time optimal control solution for unknown constrained-input systems. The proposed PI algorithm is implemented on an actor-critic structure in which two neural networks (NNs) are tuned online and simultaneously to generate the optimal bounded control policy. The requirement of complete knowledge of the system dynamics is obviated by employing a novel NN identifier in conjunction with the actor and critic NNs. It is shown how the identifier weight estimation error affects the convergence of the critic NN. A novel learning rule is developed to guarantee that the identifier weights converge exponentially fast to small neighborhoods of their ideal values. To provide an easy-to-check persistence of excitation condition, the experience replay technique is used; that is, recorded past experiences are used together with current data for the adaptation of the identifier weights. Stability of the whole system, consisting of the actor, critic, system state, and system identifier, is guaranteed while all three networks undergo adaptation. Convergence to a near-optimal control law is also shown. The effectiveness of the proposed method is illustrated with a simulation example.
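    The role of experience replay can be illustrated with a much simpler problem than the paper's: a linear-in-parameters model y = theta^T phi updated by gradient steps in discrete time. This is a sketch under those assumptions, not the paper's learning rule, and all names and numbers are hypothetical. The current regressor is held constant, so it alone is not exciting; adding two recorded samples whose regressors span the parameter space is enough for the estimate to converge, and that spanning property is something one can check directly on the stored data.

```python
def replay_update(theta, current, memory, lr):
    """One gradient step on the squared error of the current and recorded samples."""
    new_theta = theta[:]
    for phi, y in [current] + memory:
        err = y - sum(t * p for t, p in zip(theta, phi))
        for i, p in enumerate(phi):
            new_theta[i] += lr * err * p
    return new_theta


def identify(true_theta, memory_phis, steps=200, lr=0.2):
    """Run the update with a constant (non-exciting) current regressor."""
    def measure(phi):
        return sum(t * p for t, p in zip(true_theta, phi))

    memory = [(phi, measure(phi)) for phi in memory_phis]
    current = ([1.0, 1.0], measure([1.0, 1.0]))
    theta = [0.0, 0.0]
    for _ in range(steps):
        theta = replay_update(theta, current, memory, lr)
    return theta


with_replay = identify([2.0, -1.0], [[1.0, 0.0], [0.0, 1.0]])
without_replay = identify([2.0, -1.0], [])
```

    Here `with_replay` approaches the true weights, while `without_replay` settles on a different vector that merely reproduces the single repeated measurement.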

    Intelligent fading memory for high maneuvering target tracking

    status: published

    Online Solution of Nonquadratic Two-Player Zero-Sum Games Arising in the H∞ Control of Constrained Input Systems

    In this paper, we present an online learning algorithm to find the solution to the H∞ control problem of continuous-time systems with input constraints. A suitable nonquadratic functional is used to encode the input constraints into the H∞ control problem, and the resulting problem is formulated as a two-player zero-sum game with a nonquadratic performance index. Then, a policy iteration algorithm on an actor-critic-disturbance structure is developed to solve the Hamilton-Jacobi-Isaacs (HJI) equation associated with this nonquadratic zero-sum game. That is, three NN approximators, namely an actor, a critic, and a disturbance, are tuned online and simultaneously to approximate the HJI solution. The value of the actor and disturbance policies is approximated continuously by the critic NN, and on the basis of this value estimate the actor and disturbance NNs are updated in real time to improve their policies. The disturbance tries to produce the worst possible disturbance, whereas the actor tries to produce the best control input. A persistence of excitation condition is shown to guarantee convergence to the optimal saddle point solution. Stability of the closed-loop system is also guaranteed. A simulation on a nonlinear benchmark problem is performed to validate the effectiveness of the proposed approach.
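    The saddle-point structure of such a zero-sum game can be seen in a deliberately simplified scalar case. The sketch below assumes a quadratic control penalty r*u^2 in place of the paper's nonquadratic one, a fixed hypothetical costate value `p`, and scalar dynamics a*x + b*u + k*d; none of these numbers come from the paper. The Hamiltonian is convex in the control u and concave in the disturbance d, so the minimizing control and the maximizing disturbance meet at a saddle point.

```python
def hamiltonian(u, d, x=1.0, p=0.8, a=-1.0, b=1.0, k=0.5,
                q=1.0, r=1.0, gamma=2.0):
    """Scalar Hamiltonian with an assumed quadratic control penalty."""
    running_cost = q * x * x + r * u * u - gamma ** 2 * d * d
    return running_cost + p * (a * x + b * u + k * d)


def saddle_policies(p=0.8, b=1.0, k=0.5, r=1.0, gamma=2.0):
    """Stationary points: dH/du = 0 for the control, dH/dd = 0 for the disturbance."""
    u_star = -p * b / (2.0 * r)
    d_star = p * k / (2.0 * gamma ** 2)
    return u_star, d_star


u_star, d_star = saddle_policies()
h_star = hamiltonian(u_star, d_star)
```

    Any other disturbance gives a value no larger than `h_star` when the control stays at `u_star`, and any other control gives a value no smaller than `h_star` when the disturbance stays at `d_star`.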